Abstract 

Energy harvesting sensor nodes are gaining popularity due to their ability to improve the network life time 
and are becoming a preferred choice supporting "green communication". In this paper we focus on communicating 
reliably over an AWGN channel using such an energy harvesting sensor node. An important part of this work 
involves appropriate modeling of the energy harvesting, as done via various practical architectures. Our main result 
is the characterization of the Shannon capacity of the communication system. The key technical challenge involves 
dealing with the dynamic (and stochastic) nature of the (quadratic) cost of the input to the channel. As a corollary, 
we find close connections between the capacity achieving energy management policies and the queueing theoretic 
throughput optimal policies. 

Keywords: Information capacity, energy harvesting, sensor networks, fading channel, energy buffer, 
network life time. 

I. Introduction 

Sensor nodes are often deployed for monitoring a random field. These nodes are characterized by 
limited battery power, computational resources and storage space. Once deployed, the batteries of these nodes are often not changed because the nodes are inaccessible. Nodes could use larger batteries, but only with increased weight, volume and cost. Hence when the battery of a node is exhausted, it is not replaced and the node dies. When a sufficient number of nodes die, the network may not be able to perform its designated task. Thus the life time of a network is an important characteristic of a sensor network ([1]) and it depends on the life time of a node. 

The network life time can be improved by reducing the energy intensive tasks, e.g., reducing the number of bits to transmit ([2], [3]), making a node go into power saving modes (sleep/listen) periodically ([4]), using energy efficient routing ([5], [6]), adaptive sensing rates and multiple access channel ([7]). Network life time can also be increased by suitable architectural choices like the tiered system ([8]) and redundant placement of nodes ([9]). 

Recently, new techniques that increase network life time by increasing the life time of the battery are gaining popularity. This is made possible by energy harvesting techniques ([10], [11]). Energy harvesters harness energy from the environment or other energy sources (e.g., body heat) and convert it to electrical energy. Common energy harvesting devices are solar cells, wind turbines and piezo-electric cells, which extract energy from the environment. Among these, harvesting solar energy through the photovoltaic effect seems to have emerged as a technology of choice for many sensor nodes ([11], [12]). Unlike for a battery operated sensor node, now there is potentially an infinite amount of energy available to the node. However, the source of energy and the energy harvesting device may be such that the energy cannot be generated at all times (e.g., a solar cell). Furthermore, the rate of generation of energy can be limited. Thus one may want to match the energy generation profile of the harvesting source with the energy consumption profile of the sensor node. If the energy can be stored in the sensor node then this matching can be considerably simplified. But the energy storage device may have limited capacity. The energy consumption policy should be designed in such a way that the node can perform satisfactorily for a long time, i.e., energy starvation, at least, should not be the reason for the node to die. In [10] such an energy/power management scheme is called energy neutral operation. 

In the following we survey the relevant literature. Early papers on energy harvesting in sensor networks are [13] and [14]. A practical solar energy harvesting sensor node prototype is described in [15]. In [10] various deterministic models for energy generation and energy consumption profiles are studied and conditions for energy neutral operation are provided. In [16] a sensor node is considered which is sensing certain interesting events. The authors study optimal sleep-wake cycles such that the event detection probability is maximized. A recent survey on energy harvesting is [17]. 

Energy harvesting can often be divided into two major architectures ([15]). In Harvest-Use (HU), the harvesting system directly powers the sensor node and when sufficient energy is not available the node is disabled. In Harvest-Store-Use (HSU) there is a storage device that stores the harvested energy and also powers the sensor node. The storage can be single or double staged ([10], El). 



Various throughput and delay optimal energy management policies for energy harvesting sensor nodes are provided in [18]. The energy management policies in [18] are extended in various directions in [19] and [20]. For example, [19] also provides some efficient MAC policies for energy harvesting nodes. In [20] optimal sleep-wake policies are obtained for such nodes. Furthermore, [21] considers jointly optimal routing, scheduling and power control policies for networks of energy harvesting nodes. Energy management policies for finite data and energy buffers are provided in [22]. Reference [23] provides optimal energy management policies and energy allocation over source acquisition/compression and transmission. 

In a recent contribution, optimal energy allocation policies over a finite horizon and fading channels are studied in [24]. Relevant literature for models combining information theory and queueing theory is [25] and [26]. 

The capacity of a fading Gaussian channel with channel state information (CSI) at the transmitter and receiver, and at the receiver alone, is provided in [27]. It was shown that the optimal power adaptation when CSI is available both at the transmitter and the receiver is 'water filling' in time. 

Information-theoretic capacity of an energy harvesting system has been considered previously in [28] and [29] independently. It was shown that the capacity of the energy harvesting AWGN channel with an unlimited battery is equal to the capacity with an average power constraint equal to the average recharge rate. In [29] the proof technique used is based on AMS sequences [30], which is different from that used in [28]. 

Our main contributions are in considering significant extensions to the basic energy harvesting system by considering processor energy, energy inefficiencies (and finally channel fading). We compute the capacity when energy is also consumed in activities at the node other than transmission (e.g., processing, sensing, etc.). This issue of energy consumed in processing in the context of the usual AWGN channel (i.e., without energy harvesters) is addressed in [31]. We also provide the achievable rates when there are storage inefficiencies. We show that the throughput optimal policies provided in [18] are related to the capacity achieving policies provided here. We also extend the results to a scenario with fast fading. Further, we combine the information-theoretic and queueing-theoretic models for the above scenarios. Finally, we provide achievable rates when the nodes have a finite buffer to store the harvested energy. Our results can be useful in the context of green communication ([32], [33]) when solar and/or wind energy can be used by a base station ([34]). 

System level power consumption in wireless systems, including energy expended in decoding, is provided in [35]. Related literature on conserving energy but without an energy harvester is [36]-[37]. In [36] an explicit model for power consumption at an idealized decoder is studied. Optimal constellation size for uncoded transmission subject to a peak power constraint is given in [38]. Reference [37] characterizes the capacity when the transmitter and the receiver probe the state of the channel. The probing action is cost constrained. 

The paper is organized as follows. Section II describes the system model. Section III provides the capacity for a single node under idealistic assumptions. Section IV takes into account the energy spent on sensing, computation etc. and proposes capacity achieving sleep-wake schemes. Section V obtains efficient policies with inefficiencies in the energy storage system. Section VI studies the capacity of the energy harvesting system transmitting over a fading AWGN channel. Section VII combines the information-theoretic and queueing-theoretic formulations. Section VIII provides achievable rates for the practically interesting case of a finite buffer. Section IX concludes the paper. 



II. Model and notation 
Fig. 1. The model 



In this section we present our model for a single energy harvesting sensor node. We consider a sensor node (Fig. 1) which is sensing and generating data to be transmitted to a central node via a discrete time AWGN channel. We assume that transmission consumes most of the energy in a sensor node and ignore other causes of energy consumption (this is true for many low quality, low rate sensor nodes ([12])). This assumption will be removed in Section IV. The sensor node is able to replenish energy by $Y_k$ at time $k$. The energy available in the node at time $k$ is $E_k$. This energy is stored in an energy buffer with an infinite capacity. In this section the fading effects are not considered; however this issue is addressed in Section VI. 



The node uses energy $T_k$ at time $k$, which depends on $E_k$ and satisfies $T_k \le E_k$. The process $\{E_k\}$ satisfies 

$$E_{k+1} = (E_k - T_k) + Y_k. \qquad (1)$$

We will assume that $\{Y_k\}$ is stationary ergodic. This assumption is general enough to cover most of the stochastic models developed for energy harvesting. Often the energy harvesting process will be time varying (e.g., solar cell energy harvesting will depend on the time of day). Such a process can be approximated by piecewise stationary processes. As in [18], we can indeed consider $\{Y_k\}$ to be periodic, stationary ergodic. 

The encoder receives a message $S$ from the node and generates an $n$-length codeword to be transmitted on the AWGN channel. The channel output is $W_k = X_k + N_k$, where $X_k$ is the channel input at time $k$ and $N_k$ is independent, identically distributed (iid) Gaussian noise with zero mean and variance $\sigma^2$ (we denote the corresponding Gaussian density by $\mathcal{N}(0, \sigma^2)$). The decoder receives $W^n = (W_1, \ldots, W_n)$ and reconstructs an estimate $\hat{S}$ of $S$ such that the probability of decoding error is minimized. 

We will obtain the information-theoretic capacity of this channel. This of course assumes that there is always data to be sent at the sensor node (this assumption will be removed in Section VII). This channel is essentially different from the usually studied systems in the sense that the transmit power and coding scheme can depend on the energy available in the energy buffer at that time. 

A possible generalization of our model is that the energy $E_k$ changes at a slower time scale than a channel symbol transmission time, i.e., in equation (1) $k$ represents a time slot which consists of $m$ channel uses, $m > 1$. We comment on this generalization in Section III (see also Section VII). 

III. Capacity for the Ideal System 

In this section we obtain the capacity of the channel with an energy harvesting node under ideal conditions of infinite energy buffer and energy consumption in transmission only. 

The system starts at time $k = 0$ with an empty energy buffer and $E_k$ evolves with time depending on $Y_k$ and $T_k$. Thus $\{E_k, k \ge 0\}$ is not stationary and hence $\{T_k\}$ may also not be stationary. In this setup, a reasonable general assumption is to expect $\{T_k\}$ to be asymptotically stationary. Indeed we will see that this will be sufficient for our purposes. These sequences are a subset of Asymptotically Mean Stationary (AMS) sequences, i.e., sequences $\{T_k\}$ such that 

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} P[T_k \in A] = \bar{P}(A) \qquad (2)$$

exists for all measurable $A$. In that case $\bar{P}$ is also a probability measure and is called the stationary mean of the AMS sequence ([30]). 

If the input $\{X_k\}$ is AMS and ergodic, then it can be easily shown that for the AWGN channel $\{(X_k, W_k), k \ge 0\}$ is also AMS and ergodic. In the following theorem we will show that the channel capacity of our system is ([30]) 

$$C = \sup_{p_x} I(X; W) = \sup_{p_x} \limsup_{n \to \infty} \frac{1}{n} I(X^n; W^n), \qquad (3)$$

where $\{X_n\}$ is an AMS sequence, $X^n = (X_1, \ldots, X_n)$ and the supremum is over all possible AMS sequences $\{X_n\}$. In other words, one can find a sequence of codewords with code length $n$ and rate $R$ such that the average probability of error goes to zero as $n \to \infty$ if and only if $R < C$. 

Theorem 1: For the energy harvesting system, the capacity is $C = 0.5 \log(1 + E[Y]/\sigma^2)$. 

Proof: See Appendix A. This result has also appeared in [28]. The achievability proofs are somewhat different (both the scheme itself as well as the technical approach to the proof). 

Thus we see that the capacity of this channel is the same as that of a node with average energy constraint $E[Y]$, i.e., the hard energy constraint of $E_k$ at time $k$ does not affect its capacity. The capacity achieving signaling in the above theorem is truncated iid Gaussian with zero mean and variance $E[Y]$, where the truncation occurs due to the energy limitation $E_k$ at time $k$. The same capacity is obtained for any other initial energy $E_0$ (because then also our signaling scheme leads to an AMS sequence with the same stationary mean). 
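
The buffer recursion and the truncated Gaussian signaling can be illustrated with a short simulation. The following is only a sketch under stated assumptions: the harvest process (`example_harvest`, uniform on four energy levels), the back-off `eps`, the unit noise variance and all function names are hypothetical choices, and logarithms are taken to base 2. It checks empirically that the time-averaged transmit energy settles near $E[Y] - \epsilon$ and that truncation becomes rare once the buffer has built up.

```python
import math
import random


def awgn_capacity(power, noise_var):
    """0.5 * log2(1 + P / sigma^2), in bits per channel use."""
    return 0.5 * math.log2(1.0 + power / noise_var)


def simulate_truncated_gaussian(harvest, mean_y, n_steps, eps=0.05, seed=0):
    """Run E_{k+1} = (E_k - T_k) + Y_k under truncated Gaussian signaling.

    harvest: hypothetical callable taking a random.Random and returning Y_k >= 0.
    mean_y:  E[Y], supplied by the caller.
    Returns (time-averaged transmit energy, fraction of truncated symbols).
    """
    rng = random.Random(seed)
    sigma_x = math.sqrt(mean_y - eps)      # std of the untruncated input X'_k
    energy = 0.0                           # empty buffer at k = 0
    used = 0.0
    truncated = 0
    for _ in range(n_steps):
        x_prime = rng.gauss(0.0, sigma_x)
        amp = min(math.sqrt(energy), abs(x_prime))   # |X_k| <= sqrt(E_k)
        if amp < abs(x_prime):
            truncated += 1
        t_k = amp * amp                    # T_k = X_k^2 <= E_k
        used += t_k
        # Buffer update; max() only guards against floating-point round-off.
        energy = max(energy - t_k, 0.0) + harvest(rng)
    return used / n_steps, truncated / n_steps


def example_harvest(rng):
    """Illustrative iid harvest process, uniform on four energy levels."""
    return rng.choice([0.25, 0.5, 0.75, 1.0])


if __name__ == "__main__":
    avg_t, frac = simulate_truncated_gaussian(example_harvest, 0.625, 200_000)
    print("average transmit energy:", round(avg_t, 3))
    print("fraction of truncated symbols:", round(frac, 4))
    print("capacity with average power E[Y] (bits):", round(awgn_capacity(0.625, 1.0), 3))
```

Because the input variance is set slightly below $E[Y]$, the buffer has a positive drift, so truncation is confined to an initial transient in this sketch.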

The scenario when there is no energy buffer to store the harvested energy (Harvest-Use) was studied extensively in [39], which calculated the capacity to be $C = \max_{p_x} I(X; W) \le 0.5 \log(1 + E[Y]/\sigma^2)$. We mention this result in some detail (and variations) since this material will be used in developing later sections. The last inequality is strict unless $X_k$ is $\mathcal{N}(0, E[Y])$ and $Y_k$ is also known at the receiver at time $k$. Then $X_k^2 = Y_k$ and hence $Y_k$ is chi-square distributed with degree 1. If $Y_k = E[Y]$ then the capacity will be that of an AWGN channel with peak and average power constraint equal to $E[Y]$. This problem is addressed in [40], [41], [42] and the capacity achieving distribution is finite and discrete. Let $X(y)$ denote a random variable having the distribution that achieves capacity with peak power $y$. Then, for the case when information about $Y_k$ is also available at the decoder at time $k$, the capacity of the channel when $\{Y_k\}_{k \ge 1}$ is iid is 

$$C = E_Y[I(X(Y); W)]. \qquad (4)$$

For small $y$, $X^2(y) = y$. This result can be extended to the case when $\{Y_k\}$ is stationary ergodic. Then the right side of (4) will be replaced by the information rate of $\{X_k(Y_k), W_k\}$. In conclusion, having some energy buffer to store the harvested energy almost always strictly increases the capacity of the system (under ideal conditions of this section). 
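
As a rough numerical companion to (4), the sketch below evaluates the Harvest-Use rate under the assumption that every harvested value is small enough for the equiprobable binary input $X = \pm\sqrt{y}$ to be capacity achieving (the text states $X^2(y) = y$ only for small $y$). The harvest distribution, the unit noise variance, the base-2 logarithm and the function names are illustrative assumptions, and the mutual information is obtained by a plain Riemann sum.

```python
import math


def gaussian_pdf(w, mean, var):
    """Density of N(mean, var) evaluated at w."""
    return math.exp(-(w - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def binary_input_mi(peak_energy, noise_var=1.0, n_grid=4001):
    """I(X; X + N) in bits for X = +/- sqrt(peak_energy) with equal probability."""
    if peak_energy <= 0.0:
        return 0.0
    a = math.sqrt(peak_energy)
    s = math.sqrt(noise_var)
    lo, hi = -a - 10.0 * s, a + 10.0 * s          # integration window
    dw = (hi - lo) / (n_grid - 1)
    h_w = 0.0                                     # differential entropy of W
    for i in range(n_grid):
        w = lo + i * dw
        f = 0.5 * gaussian_pdf(w, a, noise_var) + 0.5 * gaussian_pdf(w, -a, noise_var)
        if f > 0.0:
            h_w -= f * math.log2(f) * dw
    h_n = 0.5 * math.log2(2.0 * math.pi * math.e * noise_var)
    return h_w - h_n                              # I(X;W) = h(W) - h(N)


def harvest_use_rate(values, probs, noise_var=1.0):
    """E_Y[I(X(Y); W)] when Y takes `values` with probabilities `probs`."""
    return sum(p * binary_input_mi(y, noise_var) for y, p in zip(values, probs))


if __name__ == "__main__":
    values, probs = [0.25, 0.5, 0.75, 1.0], [0.25] * 4
    mean_y = sum(p * y for y, p in zip(values, probs))
    print("Harvest-Use rate (bits):", harvest_use_rate(values, probs))
    print("infinite-buffer capacity (bits):", 0.5 * math.log2(1.0 + mean_y))
```

For these illustrative values the Harvest-Use rate stays below $0.5 \log(1 + E[Y]/\sigma^2)$, in line with the conclusion that an energy buffer increases capacity.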

Next we extend this result to the case when only partial information about $Y_k$ is available at the encoder and the decoder at time $k$ (causally). The interesting case of $Y_k$ being perfectly available at the encoder and not at the decoder is a special case of this setup. The channel is given in Fig. 2, where $V_k^{(t)}$ and $V_k^{(r)}$ denote the partial information about $Y_k$ at the encoder and the decoder respectively. For simplicity, we will assume $\{Y_k\}$ to be iid. The capacity of this channel can be obtained from the capacity of a state dependent channel with partial state information at the encoder and the decoder ([43]): 

$$C = \sup_{P_T(\cdot)} I(T; W \mid V^{(r)}), \qquad (5)$$

where the supremum is over distributions $P_T(\cdot)$ of continuous functions $T \in \mathcal{T}$, $T : \mathcal{V}^{(t)} \to \mathcal{X}$, where $\mathcal{Y}$ and $\mathcal{X}$ denote the sets in which $Y$ and $X$ take values. $\mathcal{T}$ denotes the set of all $|\mathcal{X}|^{|\mathcal{V}^{(t)}|}$ functions from $\mathcal{V}^{(t)}$ to $\mathcal{X}$. Also, $T$ is independent of $V^{(t)}$ and $V^{(r)}$. The capacity when $Y_k$ is not available at the decoder but perfectly known at the encoder is obtained by substituting $V_k^{(t)} = Y_k$ and $V_k^{(r)} = \emptyset$ in (5). 

Fig. 2. Capacity with no buffer and partial energy harvesting information 

In [18], a system with a data buffer at the node, which stores data sensed by the node before transmitting it, is considered. The stability regions (for the data buffer) for the 'no-buffer' and 'infinite-buffer' cases, corresponding to the harvest-use and harvest-store-use architectures, are provided. The throughput optimal policies in [18] are $T_n = \min(E_n, E[Y] - \epsilon)$ for the infinite energy buffer and $T_n = Y_n$ when there is no energy buffer. Hence we see that the Shannon capacity achieving energy management policies provided here are close to the throughput optimal policies in [18]. Also the capacity is the same as the maximum throughput obtained in the data-buffer case in [18] for the infinite buffer architecture. In Section VII we will further connect this model with the information-theoretic model studied above. 

Above we considered the cases when there is an infinite energy buffer or when there is no buffer at all. However, in practice there is often a finite energy buffer to store the harvested energy. This case is considered in Section VIII, where we provide achievable rates. 

Next we comment on the capacity results when (1) represents $E_{k+1}$ at the end of the $k$th slot, where a slot represents $m$ channel uses. In this case energy $E_k$ is available not for one channel use but for $m$ channel uses. This relaxes our energy constraints. Thus if $E[Y]$ still denotes the mean energy harvested per channel use, then for the infinite buffer case the capacity remains the same as in Theorem 1. 

IV. Capacity with Processor Energy (PE) 

Till now we have assumed that all the energy that a node consumes is for transmission. However, sensing, processing and receiving (from other nodes) also require significant energy, especially in recent higher-end sensor nodes ([12]). We will now include the energy consumed by sensing and processing only. 

We assume that energy $Z_k$ is consumed by the node (if $E_k > Z_k$) for sensing and processing at time instant $k$. Thus, for transmission at time $k$, only $E_k - Z_k$ is available. $\{Z_k\}$ is assumed to be a stationary, ergodic sequence. The rest of the system is as in Section II. 



First we extend the achievable policy in Section III to incorporate this case. The signaling scheme $X_k = \mathrm{sgn}(X'_k) \min(\sqrt{E_k}, |X'_k|)$, where $\{X'_k\}$ is iid Gaussian with zero mean and variance $E[Y] - E[Z] - \epsilon$, achieves the rate 

$$R_{PE} = 0.5 \log\left(1 + \frac{E[Y] - E[Z] - \epsilon}{\sigma^2}\right). \qquad (6)$$


If the sensor node has two modes, Sleep and Awake, then the achievable rates can be improved. The sleep mode is a power saving mode in which the sensor only harvests energy and performs no other functions, so that the energy consumption is minimal (and will be ignored). If $E_k < Z_k$ then we assume that the node will sleep at time $k$. But to optimize its transmission rate it can sleep at other times also. We consider a policy called the randomized sleep policy in [20]. In this policy, at each time instant $k$ with $E_k > Z_k$ the sensor chooses to sleep with probability $p$, independent of all other random variables. We will see that such a policy can be capacity achieving in the present context. 

With the sleep option we will show that the capacity of this system is 

$$C = \sup_{p_x : E[b(X)] \le E[Y]} I(X; W), \qquad (7)$$

where $b(x)$ is the cost of transmitting $x$ and equals 

$$b(x) = \begin{cases} x^2 + \alpha, & \text{if } |x| > 0, \\ 0, & \text{if } |x| = 0, \end{cases}$$

and $\alpha = E[Z]$. Observe that if we follow a policy in which the node sleeps unless it transmits, then $b$ is the cost function. An optimal policy will have this characteristic. Denoting the expression in (7) as $C(E[Y])$, we can easily check that $C(\cdot)$ is a non-decreasing function of $E[Y]$. We also show below that $C(\cdot)$ is concave. These facts will be used in proving that (7) is the capacity of the system. 

To show concavity, for $s_1, s_2 > 0$ and $0 < \lambda < 1$ we want to show that $C(\lambda s_1 + (1 - \lambda) s_2) \ge \lambda C(s_1) + (1 - \lambda) C(s_2)$. For $s_i$, let $\mathcal{C}_i$ be the capacity achieving codebook, $i = 1, 2$. Use $\mathcal{C}_1$ for a fraction $\lambda$ of the time and $\mathcal{C}_2$ for the remaining fraction $1 - \lambda$. Then the rate achieved is $\lambda C(s_1) + (1 - \lambda) C(s_2)$ while the average energy used is $\lambda s_1 + (1 - \lambda) s_2$. Thus, we obtain the inequality showing concavity. 

Theorem 2: For the energy harvesting system with processing energy, 

$$C = \sup_{p_x : E[b(X)] \le E[Y]} I(X; W) \qquad (8)$$

is the capacity for the system. 

Proof: See Appendix B. 

It is interesting to compute the capacity (8) and the capacity achieving distribution. Without loss of generality, the node sleeps with probability $p$ ($0 < p < 1$) and with probability $(1 - p)$ the node transmits with a distribution $F_t(\cdot)$. We can write the overall input distribution, $F_{in}(\cdot)$, as a mixture distribution 

$$F_{in}(\cdot) = p\,u(\cdot) + (1 - p) F_t(\cdot),$$

where $u(\cdot)$ denotes the unit step function. The corresponding output density function $f_W(\cdot; F_t) = p f_N(\cdot) + (1 - p) \int f_N(\cdot - s)\, dF_t(s)$ is the convolution of $F_{in}(\cdot)$ and $f_N(\cdot)$, where $f_N(\cdot)$ is $\mathcal{N}(0, \sigma^2)$. The mutual information $I(X; W)$ in (8) can be written as 

$$I(F_t) \triangleq I(X; W) = p\,h(0; F_t) + (1 - p) \int h(x; F_t)\, dF_t(x) - h(N),$$

where $h(N)$ is the differential entropy of the noise $N$ and $h(x; F_t)$ is the marginal entropy function defined as 

$$h(x; F_t) = -\int f_N(w - x) \log(f_W(w; F_t))\, dw.$$

Capacity computation can be formulated as a constrained maximization problem, 

$$\sup_{F_t \in \Omega} I(F_t), \qquad (9)$$

where $\Omega = \{F_t : F_t \text{ is a cdf and } \int s^2\, dF_t(s) \le \beta_p\}$ and $\beta_p = \frac{E[Y]}{1 - p} - \alpha$. $\Omega$ is a space of distribution functions with finite second moments and is endowed with the topology of weak* convergence. This topology is metrizable with the Prohorov metric ([44]). It is easy to see that $\Omega$ is a compact, convex topological space. The compactness of $\Omega$ is a consequence of Helly's theorem and of the second moment constraint on the distribution function, which makes it tight. The objective function $I(F_t)$ is a strictly concave map from $\Omega$ to $\mathbb{R}^+$, the positive real line. We can show that $I(F_t)$ is a continuous function in the weak* topology and that $I(F_t)$ admits a weak derivative [40]. Then there is a unique distribution $F_{t0}$ that optimizes (9). The weak derivative of $I(F_t)$ with respect to $F_t$ at the optimum distribution $F_{t0}$ is 

$$I'_{F_{t0}}(F_t) = p\,h(0; F_{t0}) + (1 - p) \int h(x; F_{t0})\, dF_t(x) - h(N) - I(F_{t0}).$$

Here, $I(F_{t0})$ is the capacity of the channel. Using KKT conditions we get the necessary and sufficient condition $I'_{F_{t0}}(F_t) \le 0$, and the conditions can be simplified using the techniques in [40], [43] as 

$$Q(x) = (1 - p) h(x; F_{t0}) + K - \lambda x^2 \le 0, \quad \forall x, \qquad (10)$$

and 

$$(1 - p) h(x; F_{t0}) + K - \lambda x^2 = 0, \quad \forall x \in S, \qquad (11)$$

where $K = p\,h(0; F_{t0}) - h(N) - I(F_{t0}) + \lambda \beta_p$, $\lambda \ge 0$ is the Lagrangian multiplier and $S$ is the support set of the optimum distribution. 



The capacity achieving distribution is discrete. This can be proved using the techniques provided in [40]; the proof is omitted for brevity. The key steps of the proof include: 

• Identify the function $Q(x)$ which gives a necessary and sufficient condition for optimality. 

• Show that $Q(x)$ has an analytic extension $Q(z)$ over the whole complex plane. 

• Prove by contradiction that the zero set of $Q(z)$ cannot have limit points in its domain of definition and is at most countable. 

Since any mass point $x$ of the optimum distribution function satisfies the condition $Q(x) = 0$, the number of mass points of the optimum distribution is at most countable. 

Hence we find that the optimum input distribution is not Gaussian. To get further insight, consider $\{B_k\}$ to be iid binary random variables with $P[B_1 = 0] = p = 1 - P[B_1 = 1]$ and let $\{G_k\}$ be iid random variables with distribution $F$. Then $X'_k = B_k G_k$ is the capacity achieving iid sequence. Also, 

$$\begin{aligned} I(X'_k; X'_k + N_k) &= h(B_k G_k + N_k) - h(N_k) \\ &= h(B_k G_k + N_k) - h(B_k G_k + N_k \mid B_k) + h(B_k G_k + N_k \mid B_k) - h(N_k) \qquad (12) \\ &= I(B_k; B_k G_k + N_k) + I(G_k; B_k G_k + N_k \mid B_k) \\ &= I(B_k; B_k G_k + N_k) + (1 - p) I(G_k; G_k + N_k). \qquad (13) \end{aligned}$$

This representation suggests the following interpretation (and coding theoretic implementation) of the scheme: the overall code is a superposition of a binary ON-OFF code and an iid code with distribution $F$. The position of the ON (and OFF) symbols is used to reliably encode $I(B; BG + N)$ bits of information per channel use, while the code with distribution $F$ (which is used only during the ON symbols) reliably encodes $(1 - p) I(G; G + N)$ bits of information per channel use. 

It is interesting to compare this result with the capacity in [31]. The capacity result in [31] is only the second term in (13) evaluated with $G_k$ being Gaussian. 

In Fig. 3 we compare the optimal sleep-wake policy, a sleep-wake policy with $F$ being mean zero Gaussian with variance $E[Y]/(1 - p) - \alpha$, and the no-sleep policy with the result in [31]. We take $E[Z] = 0.5$ and $\sigma^2 = 1$. We see that when $E[Y]$ is comparable to or less than $E[Z]$, the node chooses to sleep with a high probability. When $E[Y] \gg E[Z]$ the node will not sleep at all. Also, when $E[Y] < E[Z]$, the capacity is zero when the node does not have a sleep mode; however, we obtain a positive capacity if it is allowed to sleep. When $E[Y] \gg E[Z]$, the optimal distribution $F$ tends to a Gaussian distribution with mean zero and variance $E[Y] - \alpha$. 
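
The suboptimal sleep-wake policy with Gaussian $F$ can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the optimal discrete distribution: it takes $G$ to be zero mean Gaussian with variance $E[Y]/(1 - p) - \alpha$, computes $I(X; W) = h(W) - h(N)$ for the resulting two-component Gaussian mixture by a Riemann sum, reports the second term of (13) separately, and searches a grid of sleep probabilities. Function names, the grid and the base-2 logarithm are hypothetical choices.

```python
import math


def gaussian_mixture_entropy(p, var0, var1, n_grid=2001):
    """Differential entropy (bits) of p*N(0,var0) + (1-p)*N(0,var1), by a Riemann sum."""
    span = 12.0 * math.sqrt(max(var0, var1))
    dw = 2.0 * span / (n_grid - 1)
    h = 0.0
    for i in range(n_grid):
        w = -span + i * dw
        f = (p * math.exp(-w * w / (2.0 * var0)) / math.sqrt(2.0 * math.pi * var0)
             + (1.0 - p) * math.exp(-w * w / (2.0 * var1)) / math.sqrt(2.0 * math.pi * var1))
        if f > 0.0:
            h -= f * math.log2(f) * dw
    return h


def sleep_wake_rates(mean_y, mean_z, p, noise_var=1.0):
    """Return (I(X;W), second term of (13)) for sleep probability p and Gaussian G."""
    var_g = mean_y / (1.0 - p) - mean_z    # so that (1-p)*(var_g + E[Z]) = E[Y]
    if var_g <= 0.0:
        return 0.0, 0.0                    # not enough energy to wake this often
    h_w = gaussian_mixture_entropy(p, noise_var, var_g + noise_var)
    h_n = 0.5 * math.log2(2.0 * math.pi * math.e * noise_var)
    second_term = (1.0 - p) * 0.5 * math.log2(1.0 + var_g / noise_var)
    return h_w - h_n, second_term


def no_sleep_rate(mean_y, mean_z, noise_var=1.0):
    """Rate of the always-awake Gaussian scheme; zero if E[Y] <= E[Z]."""
    power = mean_y - mean_z
    return 0.5 * math.log2(1.0 + power / noise_var) if power > 0.0 else 0.0


def best_sleep_probability(mean_y, mean_z, noise_var=1.0):
    """Grid search over p in {0, 0.01, ..., 0.98}; returns (best rate, best p)."""
    best_rate, best_p = 0.0, 0.0
    for i in range(99):
        p = i / 100.0
        rate, _ = sleep_wake_rates(mean_y, mean_z, p, noise_var)
        if rate > best_rate:
            best_rate, best_p = rate, p
    return best_rate, best_p


if __name__ == "__main__":
    for mean_y in (0.25, 0.5, 1.0, 2.0):
        rate, p = best_sleep_probability(mean_y, 0.5)
        print(mean_y, "no sleep:", round(no_sleep_rate(mean_y, 0.5), 3),
              "Gaussian sleep-wake:", round(rate, 3), "at p =", p)
```

With $E[Z] = 0.5$ and unit noise variance, this sketch gives a positive rate for $E[Y] < E[Z]$, where the no-sleep scheme gives zero, and never falls below the no-sleep rate because $p = 0$ is part of the grid.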



Fig. 3. Comparison of Sleep Wake policies ($E[Z] = 0.5$, noise variance $= 1$) 



From the figure we see that our scheme improves the capacity provided in [31]. This is due to the embedded binary code, and the difference is significant at low values of $E[Y]$. 

V. Achievable Rate with Energy Inefficiencies 

In this section we make our model more realistic by taking into account the inefficiency in storing energy in the energy buffer and the leakage from the energy buffer ([15]) for the HSU architecture. For simplicity, we will ignore the energy $Z_k$ used for sensing and processing. 

We assume that if energy $Y_k$ is harvested at time $k$, then only energy $\beta_1 Y_k$ is stored in the buffer and energy $\beta_2$ gets leaked in each slot, where $0 < \beta_1 \le 1$ and $0 \le \beta_2 < \infty$. Then (1) becomes 

$$E_{k+1} = ((E_k - T_k) - \beta_2)^+ + \beta_1 Y_k. \qquad (14)$$

The energy can be stored in a supercapacitor and/or in a battery. For a supercapacitor, $\beta_1 \ge 0.95$ and for the Ni-MH battery (the most commonly used battery) $\beta_1 \approx 0.7$. The leakage $\beta_2$ for the battery is close to 0 but for the supercapacitor it may be somewhat larger. 

In this case, similar to the achievability of Theorem 1, we can show that 

$$R_{HSU} = 0.5 \log\left(1 + \frac{\beta_1 E[Y] - \beta_2}{\sigma^2}\right) \qquad (15)$$

is achievable. This policy is neither capacity achieving nor throughput optimal [18]. An achievable rate of course is (4) (obtained via HU). In that case one does not even store energy, so $\beta_1$ and $\beta_2$ have no effect. 



The upper bound $0.5 \log(1 + E[Y]/\sigma^2)$ is achievable if $Y$ is chi-square distributed with degree 1. Now, unlike in Section III, the rate achieved by the HU may be larger than (15) for a certain range of parameter values and distributions. 



Another achievable policy for the system with an energy buffer with storage inefficiencies is to use the harvested energy $Y_k$ immediately instead of storing it in the buffer. The energy remaining after transmission is stored in the buffer. We call this the Harvest-Use-Store (HUS) architecture. For this case, (14) becomes 

$$E_{k+1} = ((E_k + \beta_1 (Y_k - T_k)^+ - (T_k - Y_k)^+)^+ - \beta_2)^+. \qquad (16)$$

Compute the largest constant $c$ such that $\beta_1 E[(Y_k - c)^+] > E[(c - Y_k)^+] + \beta_2$. This is the largest $c$ such that taking $E[T_k] < c$ will make $E_k \to \infty$ a.s. Thus, as in Theorem 1, we can show that the rate 

$$R_{HUS} = 0.5 \log\left(1 + \frac{c}{\sigma^2}\right) \qquad (17)$$

is achievable for this system. This is achievable by an input with distribution iid Gaussian with mean zero and variance $c$. 



Equation (14) approximates the system where we have only a rechargeable battery, while (16) approximates the system where the harvested energy is first stored in a supercapacitor and after initial use transferred to the battery. 

When $\beta_1 = 1$, $\beta_2 = 0$ we have obtained the capacity of this system in Section III. For the general case, its capacity is an open problem. 

We illustrate the achievable rates mentioned above via an example. 

Example 1 

Let $\{Y_k\}$ be iid taking values in $\{0.25, 0.5, 0.75, 1\}$ with equal probability. We take the loss due to leakage $\beta_2 = 0$. In Figure 4 we compare the various architectures discussed in this section for varying storage efficiency $\beta_1$. We use the result in [42] for computing the capacity in (4). From the figure it can be seen that if the storage efficiency is very poor it is better to use the HU policy. This requires no storage buffer and has a simpler architecture. If the storage efficiency is good, the HUS policy gives the best performance. For $\beta_1 = 1$, the HUS policy and the HSU policy have the same performance. Thus if we judiciously use a combination of a supercapacitor and a battery, we may obtain a better performance. 
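
The HSU and HUS rates of this example can be reproduced approximately with a few lines. The sketch below makes several assumptions: unit noise variance, base-2 logarithms, hypothetical function names, and a bisection that returns the supremum of the constants $c$ satisfying the drift condition for HUS (the root of the non-increasing drift function). It uses the energy values of Example 1 with $\beta_2 = 0$.

```python
import math


def hus_energy_budget(values, probs, beta1, beta2, iters=60):
    """Supremum of c with beta1*E[(Y-c)^+] >= E[(c-Y)^+] + beta2, by bisection."""
    def drift(c):
        stored = sum(p * max(y - c, 0.0) for y, p in zip(values, probs))
        drawn = sum(p * max(c - y, 0.0) for y, p in zip(values, probs))
        return beta1 * stored - drawn - beta2

    lo, hi = 0.0, max(values)
    if drift(lo) < 0.0:
        return 0.0                      # leakage exceeds what can be stored
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if drift(mid) >= 0.0:
            lo = mid                    # buffer still has non-negative drift
        else:
            hi = mid
    return lo


def rate_hus(values, probs, beta1, beta2, noise_var=1.0):
    """Rate (17) with the energy budget c found by bisection."""
    c = hus_energy_budget(values, probs, beta1, beta2)
    return 0.5 * math.log2(1.0 + c / noise_var)


def rate_hsu(values, probs, beta1, beta2, noise_var=1.0):
    """Rate (15); the usable power is clipped at zero."""
    mean_y = sum(p * y for y, p in zip(values, probs))
    power = max(beta1 * mean_y - beta2, 0.0)
    return 0.5 * math.log2(1.0 + power / noise_var)


if __name__ == "__main__":
    values, probs = [0.25, 0.5, 0.75, 1.0], [0.25] * 4
    for beta1 in (0.3, 0.5, 0.7, 0.9, 1.0):
        print(beta1,
              "HSU:", round(rate_hsu(values, probs, beta1, 0.0), 4),
              "HUS:", round(rate_hus(values, probs, beta1, 0.0), 4))
```

For $\beta_1 = 1$ the bisection returns $c = E[Y] = 0.625$, so the two rates coincide, and for $\beta_1 = 0.5$ it returns $c = 1.625/3$, which exceeds $\beta_1 E[Y]$.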




VI. Fading AWGN Channel 



In this section we extend the results of Theorem 1 to include fading. The rest of the notation is the same as in Section III. The model considered is given in Figure 5. 

Fig. 5. The model 

The encoder receives a message $S$ from the node and generates an $n$-length codeword to be transmitted on the fading AWGN channel. We assume flat, fast fading. At time $k$ the channel gain is $H_k$ and takes values in $\mathcal{H}$. The sequence $\{H_k\}$ is assumed iid, independent of the energy generation sequence $\{Y_k\}$. The channel output at time $k$ is $W_k = H_k X_k + N_k$, where $X_k$ is the channel input at time $k$ and $\{N_k\}$ is iid Gaussian noise with zero mean and variance $\sigma^2$. The decoder receives $W^n = (W_1, \ldots, W_n)$ and reconstructs an estimate of $S$ such that the probability of decoding error is minimized. Also, the decoder has perfect knowledge of the channel state $H_k$ at time $k$. 

If the channel input $\{X_k\}$ is AMS ergodic, then it can be easily shown that for the fading AWGN channel $\{(X_k, W_k), k \ge 0\}$ is also AMS ergodic. Thus the channel capacity of the fading system is ([30]) 

$$C = \sup_{p_x} I(X; W) = \sup_{p_x} \limsup_{n \to \infty} \frac{1}{n} I(X^n; W^n), \qquad (18)$$

where under $p_x$, $\{X_n\}$ is an AMS sequence, $X^n = (X_1, \ldots, X_n)$ and the supremum is over all possible AMS sequences $\{X_n\}$. For a fading AWGN channel, the capacity achieving $X_k$ is zero mean Gaussian with variance $T_k$, where $T_k$ depends on the power control policy used and is assumed AMS. Then $E[T] \le E[Y]$, where $E[T]$ is the mean of $T$ under its stationary mean. The following theorem shows that one can find a sequence of codewords with code length $n$ and rate $R$ such that the average probability of error goes to zero as $n \to \infty$ if and only if $R < C$, where $C$ is given in (19). 
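Theorem 3: For the energy harvesting system with perfect CSIT, 

$$C = 0.5\, E_H\left[\log\left(1 + \frac{H^2 T^*(H)}{\sigma^2}\right)\right], \qquad (19)$$

where 

$$T^*(H) = \left(\frac{1}{H_0} - \frac{1}{H}\right)^+ \qquad (20)$$

and $H_0$ is chosen such that $E_H[T^*(H)] = E[Y]$. 

Proof: See Appendix C. 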

Thus we see that the capacity of this fading channel is the same as that of a node with average power constraint $E[Y]$, and the instantaneous power is allocated according to 'water filling' power allocation. The hard energy constraint of $E_k$ at time $k$ does not affect its capacity. The capacity achieving signaling for our system is $X_k = \mathrm{sgn}(X'_k) \min(\sqrt{T^*(H_k)}\, |X'_k|, \sqrt{E_k})$, where $\{X'_k\}$ is iid $\mathcal{N}(0, 1)$ and $T^*(H)$ is defined in (20). 
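
The water-filling allocation can be computed numerically for a finite-valued fading gain. The sketch below is only an illustration under explicit assumptions: it uses the standard water-filling form $T(h) = (\nu - \sigma^2/h^2)^+$ for the channel $W = HX + N$, which may be normalized differently from (20); the gain distribution, the mean harvested energy, the base-2 logarithm and the function names are hypothetical. The water level $\nu$ is found by bisection so that the mean allocated power equals $E[Y]$, and the resulting rate is compared with constant power allocation (no CSIT).

```python
import math


def water_filling(gains, probs, mean_power, noise_var=1.0, iters=100):
    """Allocation T(h) = max(level - noise_var/h^2, 0) with E[T(H)] = mean_power.

    gains/probs: hypothetical finite distribution of the channel gain H (all h > 0).
    Returns (water level, list of powers, one per gain value).
    """
    if any(h <= 0.0 for h in gains):
        raise ValueError("this sketch assumes strictly positive gains")
    inv = [noise_var / (h * h) for h in gains]      # inverse channel quality

    def powers(level):
        return [max(level - v, 0.0) for v in inv]

    def mean(level):
        return sum(p * t for p, t in zip(probs, powers(level)))

    lo, hi = 0.0, mean_power + max(inv)             # mean(hi) >= mean_power
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(mid) < mean_power:
            lo = mid
        else:
            hi = mid
    return hi, powers(hi)


def rate_csit(gains, probs, mean_power, noise_var=1.0):
    """0.5 * E_H[log2(1 + H^2 T(H) / sigma^2)] under water filling."""
    _, alloc = water_filling(gains, probs, mean_power, noise_var)
    return sum(p * 0.5 * math.log2(1.0 + h * h * t / noise_var)
               for h, p, t in zip(gains, probs, alloc))


def rate_no_csit(gains, probs, mean_power, noise_var=1.0):
    """0.5 * E_H[log2(1 + H^2 E[Y] / sigma^2)] with constant transmit power."""
    return sum(p * 0.5 * math.log2(1.0 + h * h * mean_power / noise_var)
               for h, p in zip(gains, probs))


if __name__ == "__main__":
    gains, probs = [0.5, 1.0, 2.0], [1.0 / 3.0] * 3
    level, alloc = water_filling(gains, probs, 0.625)
    print("water level:", round(level, 4), "powers:", [round(t, 4) for t in alloc])
    print("rate with CSIT (bits):", round(rate_csit(gains, probs, 0.625), 4))
    print("rate without CSIT (bits):", round(rate_no_csit(gains, probs, 0.625), 4))
```

In this illustrative three-state channel the weakest state receives no power, and the water-filling rate exceeds the constant-power rate, as expected since constant power is one of the feasible allocations.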



When no CSI is available at the transmitter (but perfect CSI is available at the decoder), take $X_k = \mathrm{sgn}(X'_k) \min(|X'_k|, \sqrt{E_k})$, where $\{X'_k\}$ is iid $\mathcal{N}(0, E[Y])$; as in Theorem 1, this approaches the capacity of $0.5\, E_H[\log(1 + H^2 E[Y]/\sigma^2)]$. 

Similar to the non-fading case, the throughput optimal policies in [18] are related to the Shannon capacity achieving energy management policies provided here for the infinite buffer case. Also the capacity is the same as the maximum throughput obtained in the data-buffer case in [18]. 

If there is no energy buffer to store the harvested energy then at time $k$ only $Y_k$ energy is available.
Thus $X_k$ is peak power limited to $Y_k$. The capacity achieving distribution for an AWGN channel with
peak power constraint $Y_k = y$ is not Gaussian. Let $X(y, \sigma^2)$ be a random variable with the capacity
achieving distribution for an AWGN channel with peak power constraint $y$ and noise variance $\sigma^2$. In
general this distribution is discrete. Thus, if CSIT is exact then the transmitter will transmit $X(y, \sigma^2/h^2)$
at time $k$ when $Y_k = y$ and $H_k = h$. Therefore the ergodic capacity, with this information being available
at the receiver, is $E_{YH}[I(X(Y, \sigma^2/H^2); W)]$. If there is no CSIT then we can transmit $X(y, \sigma^2)$ and
the corresponding capacity is $E_{YH}[I(X(Y, \sigma^2); W)]$.



A. Capacity with Energy Consumption in Sensing and Processing 



In this section we extend the results in Section IV to the fading case. 

First we extend the achievable policies given above to incorporate the energy consumption in activities
other than transmission. We assume perfect CSIR for the channel state $H_k$ at time $k$. When there is
perfect CSIT also, we use the signaling scheme $X_k = \mathrm{sgn}(X'_k) \min(\sqrt{T^*(H_k)}\,|X'_k|, \sqrt{E_k})$, where $\{X'_k\}$
is iid $\mathcal{N}(0, 1)$ and $T^*(H)$ is the optimum power allocation such that $E[T^*(H)] = E[Y] - E[Z] - \epsilon$.
When no CSI is available at the transmitter, we use $X_k = \mathrm{sgn}(X'_k) \min(|X'_k|, \sqrt{E_k})$ where $\{X'_k\}$ is iid
$\mathcal{N}(0, E[Y] - E[Z] - \epsilon)$. The achievable rates for CSIT and no CSIT respectively are,



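$$R_{PE\text{-}CSIT} = 0.5\, E_H\left[\log\left(1 + \frac{H^2\, T^*(H)}{\sigma^2}\right)\right], \qquad (21)$$

$$R_{PE\text{-}NCSIT} = 0.5\, E_H\left[\log\left(1 + \frac{H^2\,(E[Y] - E[Z] - \epsilon)}{\sigma^2}\right)\right]. \qquad (22)$$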



When Sleep Wake modes are supported the achievable rates can be improved as in Section IV.
Theorem 4 Let $\mathcal{P}(H)$ be the set of all feasible power allocation policies such that for $P(H) \in \mathcal{P}(H)$,
$E_H[P(H)] \le E[Y]$. For the energy harvesting system with processing energy transmitting over a fading
Gaussian channel,

$$C = \sup_{P(H) \in \mathcal{P}(H)} \; \sup_{p_x :\, E[b(X)] \le P(H)} E[I(X; W)] \qquad (23)$$

is the capacity for the system.
Proof: See Appendix D.

We compute the capacity (23) and the capacity achieving distribution. Let $P^*(h)$ be the power allocated
in state $h$. Without loss of generality, under $H = h$, the node sleeps with probability $p$ ($0 \le p \le 1$) and
with probability $(1 - p)$ the node transmits with a distribution $F_t(\cdot)$. As in Section IV, we can show using
KKT conditions that the capacity achieving distribution for state $H = h$ is discrete and the number of
mass points is at most countable with $E[b(X)] \le P(h)$. As in the case without fading, the distribution
$F_t(\cdot)$ under $H = h$ is not Gaussian.

The optimal power allocation policy $P^*(H)$ that maximizes (23) is not 'water filling' but is similar and
uses more power when the channel is better.
Example 2 

Let the fade states take values in {0.5, 1, 1.2} with probabilities {0.1, 0.7, 0.1}. We take $\alpha = E[Z] =
0.5$, $\sigma^2 = 1$. We compare the capacity for the cases with perfect and no CSIT when there is no sleep
mode supported (Equations (21), (22)) and with the optimal sleep probability in Figure 6.
Fig. 6. Comparison of sleep wake policies

From the figure we observe that 

• The randomized sleep wake policy improves the rate significantly when E[Y] < E[Z]. 

• The sensor node chooses not to sleep when $E[Y] \gg E[Z]$.

B. Achievable Rate with Energy Inefficiencies 

In this section we take into account the inefficiency in storing energy in the energy buffer and the
leakage from the energy buffer. The notation is the same as in Section V.
The energy evolves as

$$E_{k+1} = ((E_k - T_k) - \beta_2)^+ + \beta_1 Y_k. \qquad (24)$$



In this case, similar to the achievability of Theorem 3 we can show that the rates

$$R_{S\text{-}NCSIT} = 0.5\, E_H\left[\log\left(1 + \frac{H^2\,(\beta_1 E[Y] - \beta_2)}{\sigma^2}\right)\right], \qquad (25)$$

$$R_{S\text{-}CSIT} = 0.5\, E_H\left[\log\left(1 + \frac{H^2\,(\beta_1 T(H) - \beta_2)}{\sigma^2}\right)\right] \qquad (26)$$

are achievable in the no CSIT and perfect CSIT case respectively, where $T(H)$ is a power allocation policy
such that (26) is maximized subject to $E_H[T(H)] \le E[Y]$. This policy is neither capacity achieving nor
throughput optimal [18].

An achievable rate when there is no buffer and perfect CSIT is

$$C = E_{YH}[I(X(Y, H); W)], \qquad (27)$$

where $X(y, h)$ is the distribution that maximizes the capacity subject to peak power constraint $y$ and fade
state $h$. A numerical method to evaluate the capacity with peak power constraints is provided in [40]. It
is also shown in [42] that for $\sqrt{y} < 1.05$, the capacity has a closed form expression

$$C(y) = y - \int_{-\infty}^{\infty} \frac{e^{-x^2/2}}{\sqrt{2\pi}} \log\cosh\left(y - \sqrt{y}\,x\right) dx. \qquad (28)$$
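The closed form (28) is straightforward to evaluate numerically. The sketch below is one possible way to do it, using composite Simpson's rule on a truncated interval. It assumes unit noise variance, natural logarithms (so the result is in nats), and a standard Gaussian density weight $e^{-x^2/2}/\sqrt{2\pi}$ inside the integral; the function names, the truncation width and the step count are illustrative choices, and the expression is only claimed by the text for $\sqrt{y} < 1.05$.

```python
import math

def log_cosh(u):
    """Numerically stable log(cosh(u))."""
    a = abs(u)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def peak_limited_capacity(y, half_width=12.0, steps=4000):
    """Evaluate (28) in nats with composite Simpson's rule (steps must be even)."""
    root_y = math.sqrt(y)
    dx = 2.0 * half_width / steps

    def integrand(x):
        gaussian_pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        return gaussian_pdf * log_cosh(y - root_y * x)

    total = integrand(-half_width) + integrand(half_width)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0       # Simpson weights 4, 2, 4, ...
        total += weight * integrand(-half_width + i * dx)
    return y - total * dx / 3.0

for y in (0.25, 0.5, 1.0):
    # Second column: (28); third column: average-power Gaussian bound 0.5*log(1+y).
    print(y, peak_limited_capacity(y), 0.5 * math.log(1.0 + y))
```

Printing the average-power bound $0.5 \log(1 + y)$ next to each value gives a quick sanity check, since a peak power constraint of $y$ also implies an average power of at most $y$.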



When there is no buffer and no CSIT the distribution that maximizes the capacity cannot be chosen
as in (27) and the capacity is less than the capacity given in (27). The capacity in (27) is without using
a buffer and hence $\beta_1$ and $\beta_2$ do not affect the capacity. Hence, unlike in Section III, (27) may be larger
than (25) and (26) for a certain range of parameter values. We will illustrate this by an example.



For the Harvest-Use-Store (HUS) architecture, (24) becomes

$$E_{k+1} = \left(\left(E_k + \beta_1 (Y_k - T_k)^+ - (T_k - Y_k)^+\right) - \beta_2\right)^+. \qquad (29)$$



Find the largest constant $c$ such that $\beta_1 E[(Y_k - c)^+] \ge E[(c - Y_k)^+] + \beta_2$. Of course $c \le E[Y]$. When
there is no CSIT, this is the largest $c$ such that taking $T_k = \min(c - \delta, E_k)$, where $\delta > 0$ is any small
constant, will make $E_k \to \infty$ a.s. and hence $T_k \to c - \delta$ a.s. Then, as in Theorem 3, we can show that

$$R_{US\text{-}NCSIT} = 0.5\, E_H\left[\log\left(1 + \frac{H^2 c}{\sigma^2}\right)\right] \qquad (30)$$

is an achievable rate.
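The constant $c$ can be found numerically, since the slack $\beta_1 E[(Y - c)^+] - E[(c - Y)^+] - \beta_2$ is non-increasing in $c$. The following Python sketch does this by bisection for a finite-valued $Y$ and then evaluates the no-CSIT rates (25) and (30). The function names are hypothetical, the distributions are those of Example 3, and $\sigma^2 = 1$ and base-2 logarithms are assumptions of the sketch.

```python
import math

def hus_sustainable_power(values, probs, beta1, beta2=0.0, iters=200):
    """Largest c with beta1*E[(Y-c)^+] >= E[(c-Y)^+] + beta2 (0.0 if no c >= 0 works)."""
    def slack(c):
        stored = sum(p * max(v - c, 0.0) for v, p in zip(values, probs))
        drawn = sum(p * max(c - v, 0.0) for v, p in zip(values, probs))
        return beta1 * stored - drawn - beta2

    lo, hi = 0.0, max(values)
    if slack(lo) < 0.0:
        return 0.0
    for _ in range(iters):                 # slack is non-increasing in c
        mid = 0.5 * (lo + hi)
        if slack(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return lo

def no_csit_rate(fades, fade_probs, power, noise_var=1.0):
    """0.5 * E_H[log2(1 + H^2 * power / noise_var)], with power floored at zero."""
    power = max(power, 0.0)
    return sum(0.5 * p * math.log2(1.0 + h * h * power / noise_var)
               for h, p in zip(fades, fade_probs))

harvest_values, harvest_probs = [0.5, 1.0], [0.6, 0.4]      # Y_k of Example 3
fades, fade_probs = [0.4, 0.8, 1.0], [0.4, 0.5, 0.1]        # fade states of Example 3
mean_harvest = sum(v * p for v, p in zip(harvest_values, harvest_probs))
beta2 = 0.0

for beta1 in (0.5, 0.75, 1.0):
    c = hus_sustainable_power(harvest_values, harvest_probs, beta1, beta2)
    rate_hus = no_csit_rate(fades, fade_probs, c)                             # (30)
    rate_hsu = no_csit_rate(fades, fade_probs, beta1 * mean_harvest - beta2)  # (25)
    print(beta1, c, rate_hus, rate_hsu)
```

For these parameters the sketch gives $c = E[Y]$ when $\beta_1 = 1$, so the two architectures coincide, while for $\beta_1 < 1$ the HUS power $c$ exceeds the HSU power $\beta_1 E[Y]$.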



When there is perfect CSIT, 'water filling' power allocation can be done subject to an average power
constraint of $c$ and the achievable rate is

$$R_{US\text{-}CSIT} = 0.5\, E_H\left[\log\left(1 + \frac{H^2\, T^*(H)}{\sigma^2}\right)\right] \qquad (31)$$

where $T^*(H)$ is the 'water filling' power allocation with $E[T^*(H)] = c$.
We illustrate the achievable rates mentioned above via an example. 
Example 3 

Let the process $\{Y_k\}$ be iid taking values in {0.5, 1} with probability {0.6, 0.4}. We take the loss due
to leakage $\beta_2 = 0$. The fade states are iid taking values in {0.4, 0.8, 1} with probability {0.4, 0.5, 0.1}.
In Figure 7 we compare the various architectures discussed in this section for varying storage efficiency
$\beta_1$. The capacity for the no buffer case with perfect CSIT is computed using equations (28) and (27).
Fig. 7. Rates for various architectures

From the figure we observe 

• Unlike the ideal system, the HSU (which uses infinite energy buffer) performs worse than the HU 
(which uses no energy buffer) when storage efficiency is poor for the perfect CSIT case. 

• When storage efficiency is high, HU policy performs worse compared to HSU and HUS for perfect 
CSIT case. 

• HUS performs better than HSU for No/Perfect CSIT.

• For $\beta_1 = 1$, the HUS policy and HSU policy are the same for both perfect CSIT and no CSIT.

• The availability of CSIT and storage architecture plays an important role in determining the achievable
rates.

Fig. 8. Model of an energy harvesting point to point channel.



VII. Combining Information and Queuing Theory 

In this section we consider a system with both energy and data buffer, each with infinite capacity
(see Fig. 8). We consider the simplest case: no fading, no battery leakage and no storage inefficiencies. The
system is slotted. During slot $k$ (defined as time interval $[k, k + 1]$, i.e., a slot is a unit of time), $A_k$ bits
are generated. Although the transmitter may generate data as packets, we allow arbitrary fragmentation
of packets during transmission. Thus, packet boundaries are not important and we consider bit strings (or
just fluid). The bits $A_k$ are eligible for transmission in the $(k + 1)$st slot. The queue length (in bits) at time
$k$ is $q_k$. We assume that transmission consumes most of the energy at the transmitter and ignore other
causes of energy consumption. We denote by $E_k$ the energy available at the node at time $k$. The energy
harvesting source is able to replenish energy by $Y_k$ in slot $k$.

In slot $k$ we will use energy

$$T_k = \min(E_k, E[Y] - \epsilon), \qquad (32)$$

where $\epsilon$ is a small positive constant. It was shown in [18] that such a policy is throughput optimal (and
it is capacity achieving in Theorem 1).

There are $n$ channel uses (mini slots) in a slot, i.e., the system uses an $n$ length code to transmit the data
in a slot. The length $n$ of the code word can be chosen to satisfy a certain code error rate. The slot length $n$
and $R_k$ are to be appropriately chosen. We use codewords of length $n$ and rate $R_k = 0.5 \log(1 + T_k/(n\sigma^2))$
in slot $k$ with the following coding and decoding scheme:

1) An augmented message set $\{1, \ldots, 2^{nR_k}\} \cup \{0\}$.

2) An encoder that assigns a codeword $x^n(m)$ to each $m \in \{1, \ldots, 2^{nR_k}\} \cup \{0\}$ where $x^n(m)$ is generated
as an iid sequence with distribution $\mathcal{N}(0, T_k/n - \delta_1)$ and $\delta_1 > 0$ is a small constant. The codeword $x^n(m)$
is retained if it satisfies the power constraint $\sum_{i=1}^{n} x_i^2 \le T_k$. Otherwise an error message is sent.

3) A decoder that assigns a message $\hat{m} \in \{1, \ldots, 2^{nR_k}\} \cup \{0\}$ to each received sequence $w^n$ in a slot
such that $(x^n(\hat{m}), w^n)$ is jointly typical and there is no other $x^n(m')$ jointly typical with $w^n$. Otherwise
it declares an error.

In slot $k$, $nR_k$ bits are taken out of the queue if $q_k \ge nR_k$. The bits are represented by a message
$m_k \in \{1, \ldots, 2^{nR_k}\}$ and $x^n(m_k)$ is sent. If $q_k < nR_k$ no bits are taken out of the queue and the "0 message"
$x^n(0)$ is sent.

Hence the processes $\{E_k\}$ and $\{q_k\}$ satisfy

$$q_{k+1} = q_k - nR_k\, I_{\{q_k \ge nR_k\}} + A_k, \qquad (33)$$

$$E_{k+1} = (E_k - T_k) + Y_k. \qquad (34)$$

With $T_k$ in (32), $E_k \to \infty$ a.s. and $T_k \to E[Y] - \epsilon$ a.s. Also, $R_k = 0.5 \log(1 + T_k/(n\sigma^2)) \to
0.5 \log(1 + (E[Y] - \epsilon)/(n\sigma^2))$. Thus we obtain

Theorem 5. The random data arrival process $\{A_k\}$ can be communicated with arbitrarily low average
probability of block error, by an energy harvesting sensor node over a Gaussian channel with a stable
queue if and only if $E[A] < 0.5\, n \log(1 + E[Y]/(n\sigma^2))$. ■
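The recursions (32)-(34) are easy to simulate. The following Python sketch is a toy illustration, not the authors' experiment: it assumes the harvesting distribution of Example 3, constant fluid arrivals of a fixed number of bits per slot, $n = 10$, $\sigma^2 = 1$, $\epsilon = 0.05$ and base-2 logarithms, and all function names are hypothetical. It runs the queue once with an arrival rate below the limiting service rate and once above it.

```python
import math
import random

MEAN_HARVEST = 0.5 * 0.6 + 1.0 * 0.4   # E[Y] for Y in {0.5, 1} w.p. {0.6, 0.4}

def bits_per_slot(energy_spent, n, noise_var):
    """n * R_k with R_k = 0.5 * log2(1 + T_k / (n * sigma^2))."""
    return 0.5 * n * math.log2(1.0 + energy_spent / (n * noise_var))

def simulate(arrival_bits, slots=2000, n=10, noise_var=1.0, eps=0.05, seed=1):
    """Run (32)-(34); return final queue length, peak queue length, final energy."""
    rng = random.Random(seed)
    energy, queue, max_queue = 0.0, 0.0, 0.0
    for _ in range(slots):
        spend = min(energy, MEAN_HARVEST - eps)          # policy (32)
        served = bits_per_slot(spend, n, noise_var)
        if queue >= served:                              # indicator in (33)
            queue -= served
        queue += arrival_bits                            # constant arrivals A_k
        harvest = 0.5 if rng.random() < 0.6 else 1.0
        energy = energy - spend + harvest                # energy update (34)
        max_queue = max(max_queue, queue)
    return queue, max_queue, energy

limit = bits_per_slot(MEAN_HARVEST - 0.05, 10, 1.0)      # limiting service, bits per slot
print("limit:", limit)
print("arrivals 0.3:", simulate(0.3))
print("arrivals 0.6:", simulate(0.6))
```

With arrivals below the limit the queue in this sketch stays bounded, while with arrivals above it the queue grows roughly linearly in the number of slots, in line with the stability condition of Theorem 5.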

In Theorem 5 'stability' of the queue has the following interpretation. If $\{A_k\}$ is stationary, ergodic
then $P[q_k \to \infty] = 0$ and with probability 1, $\{q_k\}$ visits the set $\{q : q < nR\}$ infinitely often. Also
the sequence $\{q_k\}$ is tight ([46]). If $\{A_k\}$ is iid then $\{(q_k, E_k)\}$ is a Markov chain. With $T_k$ in (32),
asymptotically, $T_k \to E[Y] - \epsilon$ a.s. and we can ignore the $E_k$ component of the process and think of
$\{q_k\}$ as a Markov chain with $T_k = E[Y] - \epsilon$. It has a finite number of ergodic sets. The process $\{q_k\}$
eventually enters one ergodic set with probability 1 and then approaches a stationary distribution. If $\{q_k\}$ is
irreducible and aperiodic then $\{q_k\}$ has a unique stationary distribution and $\{q_k\}$ converges in distribution
to it irrespective of initial conditions.

Although the capacity achieved in each slot is as per Theorem 1, the set-up used here is somewhat
different. In Theorem 1, the time scale of the dynamics of the energy process $\{E_k\}$ is mini slots, but
in this section we have taken it at the time scale of slots (which one is the right model depends on the
system under consideration). Thus, in Theorem 1 we used the theoretical tool of AMS sequences. But in
our present setup, in a slot we can use $X_1, X_2, \ldots, X_n$ iid Gaussian $\mathcal{N}(0, T_k/n - \delta)$ and use a codeword
only if it satisfies $X_1^2 + \cdots + X_n^2 \le T_k$ and $q_k \ge nR_k$; otherwise an error message is sent. Of course, if
the physical system demands that we should use for the energy dynamics the time scale of a channel use
then we can use the framework of Theorem 1.

VIII. Finite Buffer 

In this section we find achievable rates when the sensor node has a finite buffer to store the harvested
energy. This case is of more practical interest. We consider the simplest case: no fading, no battery leakage
or storage inefficiencies, and no data queue. The node has an energy buffer of size $\Gamma < \infty$. By this we
mean that the energy buffer can store a finite number of energy units of interest.

We use the HUS architecture where the energy harvested is used and only the left over energy is stored.
The energy available at the buffer at time $k$ is denoted by $E_k$. At time $k$, the node uses energy $T_k$ with
$T_k \le E_k + Y_k = \hat{E}_k$. We assume that $E_k$ and $Y_k$ take values in finite alphabets. Also, $\{Y_k\}_{k \ge 1}$ is assumed
iid.

We assume that the buffer state information (BSI), $E_k$, is perfectly available at the encoder and the
decoder at time $k$. $X_k$ denotes the codeword symbol used at time $k$ and $X_k^2 \le T_k$. Of course $T_k \le E_k$
and $E_k \le \Gamma$. In general $T_k$ is a function of $E_0, \ldots, E_k$. An easily tractable class of energy management
policies is

$$T_k = h(E_k), \qquad (35)$$

where $h$ defines the energy management policy. The codeword symbol $X_k$ is picked with a distribution
that maximizes the capacity of a Gaussian channel with peak power constraint $T_k$ (we quantize this such
that $\{E_k\}$ takes values in a finite alphabet). Hence the process $\{E_k\}_{k \ge 1}$ satisfies

$$E_{k+1} = (E_k - X_k^2) + Y_{k+1} \qquad (36)$$

and is a finite state Markov chain with the transition matrix decided by $h$. If $E_0 = 0$ then the Markov
chain will either enter only one ergodic set or possibly one of a finite number of disjoint components, which
depend on $h$. If $I^*$ and $\bar{I}$ denote the Pinsker and Dobrushin information rates ([30]), since we have finite
alphabets, $I^* = \bar{I}$. In particular,

$$I^*(X; W) = \bar{I}(X; W) = \lim_{n \to \infty} \frac{1}{n} I(X^{(n)}; W^{(n)}). \qquad (37)$$

Also, the Asymptotic Equipartition Property (AEP) holds for $\{X_k, W_k\}$.



The following theorem provides achievable rates. 

Theorem 6: A rate $R$ is achievable if an energy management policy $h$ exists, such that $R < \bar{I}(X; W)$. ■

The proof is similar to the achievability proof given in Theorem 1. The rates (37) can be computed via
algorithms available in [47] and [48]. Using stochastic approximation ([49]) we can obtain the Markov
chains that optimize (37). If the initial energy $E_0$ is not zero, then the Markov chain can enter some other
ergodic sets and the achievable rates can be different. If $h$ is such that $\{E_k\}$ is an irreducible Markov
chain then the achievable rates will be independent of the initial state $E_0$.

Theorem 6 can be generalized to include the case where $\{(E_k, X_k)\}$ is a $k$-step finite state Markov
chain. In fact if $\{(E_k, X_k)\}$ is a general AMS ergodic finite alphabet sequence then AEP holds and $I^* = \bar{I}$.
Thus, $R < \lim_{n \to \infty} n^{-1} I(X^n; W^n)$ is achievable.

The capacity of our system can be written as ([50])

$$C = \sup\; p\text{-}\liminf_{n \to \infty} \frac{1}{n} \log \frac{p(x^n, w^n)}{p(x^n)\, p(w^n)}, \qquad (38)$$

where $p$-$\liminf$ is defined in [50] and sup is over all input distributions $X^n$ which satisfy the energy
constraints $X_k^2 \le E_k$ for all $k \ge 0$. An interesting open problem is: can (38) be obtained by limiting
$\{X_n\}$ to AMS ergodic sequences mentioned above?

The achievable rates when the decoder has only partial information about $E_k$ can be handled as for the
system with no buffer and partial BSI, studied in Section III.

Example 4 

We consider a system with a finite buffer with $\Gamma = 15$ units in steps of size 1. The $Y_k$ process has
three mass points and is given in Table I. We compute the optimal achievable rate using the simultaneous
perturbation stochastic approximation algorithm [49]. The achievable rate is also compared with a greedy
policy, where the rate is evaluated using algorithms provided in [47] and [48]. In the greedy policy, at any
instant $k$, an optimum distribution for an AWGN channel with peak amplitude constrained to $\sqrt{E_k}$
is used. We have also obtained the optimal rates using a 1-step Markov policy (35) where the optimal
Markov chain is obtained via stochastic approximation. The achievable rates are then compared with the
capacity with infinite buffer and no-buffer in Figure 9.

From the figure we observe that, for a given buffer size, the greedy policy is close to optimal at higher
$E[Y]$. Also, the optimal achievable rates for the finite buffer case are close to the capacity for infinite buffer
for small $E[Y]$ but become close to those of the greedy policy at high $E[Y]$.



TABLE I

Y_k Process

E(Y)      Mass points    Probabilities
1.0141    0   1   2      0.3192  0.3474  0.3333
2.1031    1   2   3      0.2303  0.4364  0.3333
3.3078    2   3   4      0.1794  0.3333  0.4872
4.1990    3   4   5      0.2338  0.3333  0.4329
5.0854    4   5   6      0.3333  0.2479  0.4188
5.8738    5   6   7      0.3964  0.3333  0.2703
6.7168    6   7   8      0.4749  0.3333  0.1917
8.2533    7   8   9      0.2067  0.3333  0.4600
9.0332    8   9   10     0.3167  0.3333  0.3499
9.9136    9   10  11     0.3333  0.4198  0.2469
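As a quick consistency check on Table I, the listed means can be recomputed from the mass points and probabilities. The Python sketch below does this, taking the first row's mass points to be 0, 1, 2, which is what its listed mean of 1.0141 implies. The variable names are illustrative, and small discrepancies are expected because the table entries are rounded to four decimals.

```python
# Each row: (listed E(Y), mass points, probabilities), transcribed from Table I.
table_one = [
    (1.0141, (0, 1, 2), (0.3192, 0.3474, 0.3333)),
    (2.1031, (1, 2, 3), (0.2303, 0.4364, 0.3333)),
    (3.3078, (2, 3, 4), (0.1794, 0.3333, 0.4872)),
    (4.1990, (3, 4, 5), (0.2338, 0.3333, 0.4329)),
    (5.0854, (4, 5, 6), (0.3333, 0.2479, 0.4188)),
    (5.8738, (5, 6, 7), (0.3964, 0.3333, 0.2703)),
    (6.7168, (6, 7, 8), (0.4749, 0.3333, 0.1917)),
    (8.2533, (7, 8, 9), (0.2067, 0.3333, 0.4600)),
    (9.0332, (8, 9, 10), (0.3167, 0.3333, 0.3499)),
    (9.9136, (9, 10, 11), (0.3333, 0.4198, 0.2469)),
]

def mean_of(points, probs):
    """Mean of a finite-valued random variable."""
    return sum(v * p for v, p in zip(points, probs))

recomputed = [mean_of(points, probs) for _, points, probs in table_one]
for (listed, _, probs), mean in zip(table_one, recomputed):
    # Columns: listed mean, recomputed mean, total probability mass.
    print(f"{listed:7.4f}  {mean:7.4f}  {sum(probs):6.4f}")
```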
Fig. 9. Achievable rate for finite buffer



IX. Conclusions 

In this paper the Shannon capacity of an energy harvesting sensor node transmitting over an AWGN
channel is provided. It is shown that the capacity achieving policies are related to the throughput optimal
policies. Also, the capacity is provided when energy is consumed in activities other than transmission.
Achievable rates are provided when there are inefficiencies in energy storage. We extend the results to the
fast fading case. We also combine the information theoretic and queuing theoretic formulations. Finally
we also consider the case when the energy buffer is finite.

Appendix A
Proof of Theorem 1 



Codebook Generation: Let $\{X'_k\}$ be an iid Gaussian sequence with mean zero and variance $E[Y] - \epsilon$
where $\epsilon > 0$ is an arbitrarily small constant. For each message $s \in \{1, 2, \ldots, 2^{nR}\}$, generate $n$ length
codewords according to the iid distribution $\mathcal{N}(0, E[Y] - \epsilon)$. Denote the codeword by $X'^n(s)$. Disclose
this codebook to the receiver.

Encoding: When $S = s$, choose the channel codeword to be $X_k = \mathrm{sgn}(X'_k(s)) \min(\sqrt{E_k}, |X'_k(s)|)$
where $\mathrm{sgn}(x) = 1$ if $x \ge 0$ and $= -1$ if $x < 0$. Then $T_k = X_k^2 \le E_k$ and $E[T_k] = E[X_k^2] \le E[Y] - \epsilon$.
Thus, from standard results on G/G/1 queues ([51], chapter 7) $E_k \to \infty$ a.s. and hence $|X_k - X'_k| \to 0$ a.s.
Also $(X_k, W_k)$ converges almost surely (a.s.) to a random variable with the distribution of $(X'_k, W'_k)$ and
$\{(X_k, W_k, X'_k)\}$ is AMS ergodic where $W'_k = X'_k + N_k$.
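The effect of the clipped encoder can be seen in a small simulation. The Python sketch below is a toy illustration under assumed parameters (the harvesting distribution of Example 3, $\epsilon = 0.1$, a fixed random seed, and hypothetical function names): it draws $X'_k$ iid Gaussian with variance $E[Y] - \epsilon$, clips each symbol to the available energy, and records the slots in which clipping actually changed the symbol.

```python
import math
import random

def simulate_clipped_encoder(n=20000, eps=0.1, seed=7):
    """Return the slots where |X_k| < |X'_k| and the energy left after n slots."""
    rng = random.Random(seed)
    mean_harvest = 0.5 * 0.6 + 1.0 * 0.4          # E[Y] for Y in {0.5, 1}
    sigma = math.sqrt(mean_harvest - eps)          # std of the unclipped symbol X'_k
    energy = 0.0
    clipped_slots = []
    for k in range(n):
        x_prime = rng.gauss(0.0, sigma)
        # X_k = sgn(X'_k) * min(sqrt(E_k), |X'_k|); max() guards against rounding below 0.
        amplitude = min(math.sqrt(max(energy, 0.0)), abs(x_prime))
        x = math.copysign(amplitude, x_prime)
        if amplitude < abs(x_prime):
            clipped_slots.append(k)
        harvest = 0.5 if rng.random() < 0.6 else 1.0
        energy = energy - x * x + harvest          # E_{k+1} = (E_k - T_k) + Y_k
    return clipped_slots, energy

clipped_slots, final_energy = simulate_clipped_encoder()
early = sum(1 for k in clipped_slots if k < 10000)
late = len(clipped_slots) - early
print(early, late, final_energy)
```

Because the mean energy spent per slot is strictly below the mean energy harvested, the stored energy in this sketch drifts upward and clipping becomes rare after an initial transient, which is the mechanism behind $|X_k - X'_k| \to 0$ a.s.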

Decoding: The decoder obtains $W^n$ and finds the codeword $X'^n(s)$ such that $(X'^n(s), W^n) \in T_\epsilon^n$ where
$T_\epsilon^n$ is the set of weakly $\epsilon$-typical sequences of the joint AMS ergodic distribution $P_{X'W'}$. If it is a unique
$s$ then it declares $s$ as the message transmitted; otherwise it declares an error.

Analysis of error events

Suppose $s$ has been transmitted. The following error events can happen:

E1: $\{(X'^n(s), W^n) \notin T_\epsilon^n\}$. The probability of event E1 goes to zero as $\{X'_k, W_k\}$ is AMS ergodic and
AEP holds for AMS ergodic sequences ([52]), as $\{X'_k, W_k\}$ has a density with respect to iid Gaussian
measure on an appropriate Euclidean space.

E2: There exists $\hat{s} \ne s$ such that $\{(X'^n(\hat{s}), W^n) \in T_\epsilon^n\}$. Let $H(X')$, $H(W')$ be the entropy rates of
$\{X'_k\}$ and $\{W'_k\}$. Next we show that $P(E2) \to 0$ as $n \to \infty$. We have

$$P(E2) \le \sum_{\hat{s} \ne s} P\left((X'^n(\hat{s}), W^n) \in T_\epsilon^n\right)$$

$$\le 2^{nR} \sum_{(x'^n, w'^n) \in T_\epsilon^n} P(x'^n)\, P(w'^n)$$

$$\le |T_\epsilon^n|\, 2^{nR}\, 2^{-(nH(X') - \epsilon)}\, 2^{-(nH(W') - \epsilon)}$$

$$\le 2^{(nH(X', W') + \epsilon)}\, 2^{nR}\, 2^{-(nH(X') - \epsilon)}\, 2^{-(nH(W') - \epsilon)}.$$

Therefore, $P(E2) \to 0$ as $n \to \infty$ if $R < I(X'; W') = 0.5 \log(1 + P/\sigma^2)$, where $P = E[Y] - \epsilon$.
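The last step can be spelled out by collecting exponents. The following short derivation is a sketch of that bookkeeping, assuming the union bound contributes a factor of at most $2^{nR}$ (one term per competing message) and writing the information rate as $I(X'; W') = H(X') + H(W') - H(X', W')$:

```latex
\begin{align*}
P(E2) &\le 2^{nR}\, 2^{\,nH(X',W') + \epsilon}\, 2^{-(nH(X') - \epsilon)}\, 2^{-(nH(W') - \epsilon)} \\
      &= 2^{\,n\left(R + H(X',W') - H(X') - H(W')\right) + 3\epsilon} \\
      &= 2^{-n\left(I(X';W') - R\right) + 3\epsilon}.
\end{align*}
```

The exponent tends to $-\infty$ as $n \to \infty$ exactly when $R < I(X'; W')$, since the $3\epsilon$ term does not grow with $n$.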

Converse Part: For the system under consideration $\frac{1}{n} \sum_{k=1}^{n} T_k \le \frac{1}{n} \sum_{k=1}^{n} Y_k \to E[Y]$ a.s. Hence, if
$\{X_k(s), k = 1, \ldots, n\}$ is a codeword for message $s \in \{1, \ldots, 2^{nR}\}$ then for all large $n$ we must have
$\frac{1}{n} \sum_{k=1}^{n} X_k(s)^2 \le E[Y] + \delta$ with a large probability for any $\delta > 0$. Hence by the converse in the AWGN
channel case, $\limsup_{n \to \infty} \frac{1}{n} I(X^n; W^n) \le 0.5 \log(1 + (E[Y] + \delta)/\sigma^2)$. Now take $\delta \to 0$.

Combining the direct part and converse part completes the proof. ■ 

Appendix B 
Proof of Theorem 2 

Codebook Generation: For each message $s \in \{1, 2, \ldots, 2^{nR}\}$, generate $n$ length codewords according
to an iid distribution $p'_x$ with constraint $E[b(X')] = E[Y] - \epsilon$, where $\epsilon > 0$ is a small constant. Denote
the codeword by $X'^n(s)$. Disclose this codebook to the receiver.

Encoding: When $S = s$, choose the channel codeword as

$$X_k(s) = \begin{cases} \min\{X'_k(s), \sqrt{(E_k - Z_k)^+}\}, & \text{if } X'_k \ge 0, \\ \max\{X'_k(s), -\sqrt{(E_k - Z_k)^+}\}, & \text{if } X'_k < 0. \end{cases}$$

Then to transmit $X_k(s)$ we need energy $T_k = (X_k^2 + Z_k)\, 1_{\{X_k \ne 0\}}$ and $E_{k+1} = (E_k - T_k) + Y_k$. Also,

$$E[T_k] = E[X_k^2] + E[Z_k]\, P\{X_k \ne 0\} \le E[X'^2] + \alpha\, P\{X'_k \ne 0\} = E[Y] - \epsilon.$$

Thus, from standard results on G/G/1 queues ([51], chapter 7) $E_k \to \infty$ a.s. and hence $|X_k - X'_k| \to 0$
a.s. Also finite dimensional distributions of $\{(X_{m+k}, W_{m+k}, X'_{m+k}), k \ge 0\}$ converge a.s. to that of
$\{(X'_k, W'_k, X'_k)\}$. Thus $\{(X_k, W_k, X'_k)\}$ is AMS ergodic with limiting distribution that of $(X'_k, W'_k, X'_k)$, where
$W'_k = X'_k + N_k$. Furthermore the energy constraints are also met.

If the chosen codeword is e— weakly typical and XT=i K x i)/ n — E\Y] ~ e > then transmit it; otherwise 
send an error message. The probability that an error message is sent goes to zero as n — > oo. 

Decoding: The decoder obtains $W^n$. If it finds a unique codeword $X'^n(s)$ such that $(X'^n(s), W^n) \in
T_\epsilon^n$, where $T_\epsilon^n$ is the set of $\epsilon$-typical sequences for the distribution $P_{X'W'}$, it declares $s$ as the transmitted
message. Otherwise it declares an error.

By the usual methods as in Theorem 1 with the above coding-decoding scheme and also the fact that
$C(\cdot)$ is non-decreasing, we can show that the probability of error for this scheme goes to zero as $n \to \infty$.
Thus we can achieve the capacity (8).

Converse: The converse follows via Fano's inequality as in Theorem 1. For that proof to hold here, we
need that $C(\cdot)$ is concave. ■

Appendix C 
Proof of Theorem 3 



Achievability: Let $T'_k = T^*(H_k)$ with $T^*$ defined in (20) with $E[T^*(H)] = E[Y] - \epsilon$ where $\epsilon > 0$ is a
small constant. Since $\{H_k\}$ is iid, $\{T'_k\}$ is also iid. We take $T_k = \min(E_k, T'_k)$. Thus, as in the proof of
Theorem 1, from standard results on G/G/1 queues ([51], chapter 7) $E_k \to \infty$ a.s. Therefore, as $T^*(H)$
is upper bounded, $\lim_{n \to \infty} \sup_{k \ge n} |T_k - T^*(H_k)| = 0$ a.s.

Let $\{X'_k\}$ be iid Gaussian with mean zero and variance one. The channel codeword is $X_k =
\mathrm{sgn}(X'_k) \min(\sqrt{T_k}\,|X'_k|, \sqrt{E_k})$ where $\mathrm{sgn}(x) = 1$ if $x \ge 0$ and $-1$ otherwise. This is an AMS ergodic
sequence with the stationary mean being the distribution of $\sqrt{T^*(H_k)}\, X'_k$. Then, since the AWGN channel
under consideration is AMS ergodic, $(X, W) = \{(X_k, W_k), k \ge 1\}$ is AMS ergodic.

By using the techniques in Theorem 1, rates $R < \bar{I}(X; W) = 0.5\, E_H[\log(1 + H^2 T^*(H)/\sigma^2)]$ are achievable.

Converse Part: Let there be a sequence of codebooks for our system with rate $R$ and average probability
of error going to 0 as $n \to \infty$. If $\{X_k(s), k = 1, \ldots, n\}$ is a codeword for message $s \in \{1, \ldots, 2^{nR}\}$
then $\frac{1}{n} \sum_{k=1}^{n} X_k(s)^2 \le \frac{1}{n} \sum_{k=1}^{n} Y_k \le E[Y] + \delta$ for any $\delta > 0$ with a large probability for all $n$ large
enough. Hence by the converse in the fading AWGN channel case ([23]), $R \le \limsup_{k \to \infty} I(X^k; W^k)/k \le
0.5\, E_H[\log(1 + H^2 T^*(H)/\sigma^2)]$ for $T^*(H)$ given in (20).



Combining the direct and the converse part completes the proof. ■ 

Appendix D 
Proof of Theorem 4 

Fix the power allocation policy $P^*$. Under $P^*(h)$, the achievability of $\sup_{p_x :\, E[b(X)] \le P^*(h)} I(X; W)$,
whenever $H_k = h$, is proved using the techniques provided in Theorem 2 for the non-fading case.
Using this along with finding the expectation w.r.t. the optimum power allocation scheme completes the
achievability proof.

The converse follows via Fano's inequality. ■